Automated Mood Quantification of Contemporary Music

Author

  • Chris Cooke
Abstract

This study investigated the possibility of automatically quantifying mood according to two parameters, energy and tension, with values similar to those that would be assigned by humans. Human listeners completed an Activation-Deactivation Adjective Check List on fifty 20-second music clips, with moods varying across samples but uniform within each sample. Human measurements of energy and tension were calculated from the survey results. Thirty features relevant to mood were extracted from each of the clips using the Marsyas framework. These features, paired with the centroid of the human measurements, were used as training data for neural networks in a variety of configurations. Most configurations performed slightly worse than human performance in the median, and with significantly higher dispersion. One configuration had better median performance than human evaluation, with acceptable dispersion between the first and third quartiles but large dispersion across the entire set of results. While these initial results are promising, further refinement will be needed to apply the techniques to real-world systems.

© 2006 Chris Cooke, All rights reserved.

Introduction

When someone sits down to listen to music, how does he decide what to play? In some cases one may have a particular musician or genre in mind, but sometimes one wants to hear music of a certain mood. Since mood varies within a genre or within the work of a particular musician, play list selection in this case can be problematic unless the listener has extensive knowledge of the available library holdings. Consider the case of an online music site with hundreds of independent artists and thousands of songs, with more of each being added daily. The ability to build a play list based on mood under such conditions seems hopeless without automation.

Music information retrieval based on metadata such as artist name and song title has become routine. The chief problem is associating the metadata with a track in the first place, but with the advent of ID tags and databases such as the Gracenote database (Gracenote, 2006), this issue is solved even for the music of serious amateurs. Less work has been done on extracting information from the audio content of the song itself. Research into automatic genre classification shows promising results, but little has been done to detect mood.

Feng et al. (2003) used tempo features to classify mood into four basic categories. Liu et al. (2003) used a more sophisticated set of audio features to classify acoustic music clips into one of four moods derived from Thayer's model for mood and arousal. Thayer (1989) proposed a model for mood that consists, in its basic form, of two dimensions of arousal: energy and tension. Liu's team quantized a planar segment of these dimensions into four distinct moods by pairing high vs. low energy with high vs. low tension (Liu's paper uses the term "stress"). Liu's team had some success automatically classifying 20-second sound clips into one of these four categories using spectral and time-based features as input to a hierarchical Gaussian Mixture Model, which used probabilistic methods to build a decision tree for classification. While Liu's work represents a good first step, one might ask if a four-bin classification is too broad for practical play list construction based on mood.
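To make the four-quadrant scheme concrete, the sketch below maps a clip's (energy, tension) pair onto Thayer-style quadrants with simple midpoint thresholds. This is purely illustrative: the bin labels and the 0.5 midpoint are hypothetical placeholders, and Liu's team performed the actual classification with a hierarchical Gaussian Mixture Model rather than a threshold rule.

```python
# Hypothetical illustration of the four-quadrant mood scheme described
# above; labels and the 0.5 midpoint are placeholders, not values from
# Liu et al. or from this study.
def thayer_quadrant(energy: float, tension: float,
                    midpoint: float = 0.5) -> str:
    """Pair high vs. low energy with high vs. low tension to pick one
    of four coarse mood bins (Liu's paper calls tension 'stress')."""
    high_energy = energy >= midpoint
    high_tension = tension >= midpoint
    if high_energy and high_tension:
        return "high energy / high tension"
    if high_energy:
        return "high energy / low tension"
    if high_tension:
        return "low energy / high tension"
    return "low energy / low tension"

# A quiet, relaxed clip lands in the low/low bin.
print(thayer_quadrant(0.2, 0.3))  # -> "low energy / low tension"
```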
Rather than classifying a song into one of four moods, there might be some value in describing the degree of energy and tension arousal with some metric. Such a measure could then be associated with the audio file and matched against a similar metric derived from the listener's mood, perhaps using a distance algorithm or a fuzzy expert system. The central question, then, is this: given a set of extracted features, can a machine-learning algorithm produce a measure of energy and tension arousal that is comparable to that of a human listening to the same clip?
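As a rough illustration of that question, the sketch below trains a small regressor to map a 30-dimensional feature vector (assumed to have already been extracted from each clip with Marsyas) to continuous (energy, tension) targets taken from the centroid of the human ratings, and then matches the prediction against a listener's target mood with a simple Euclidean distance. scikit-learn's MLPRegressor stands in for the neural network configurations evaluated in the study; the data, shapes, and hyperparameters are placeholders, not the study's actual setup.

```python
# Minimal sketch, not the study's configuration: learn a mapping from
# 30 audio features per clip to continuous (energy, tension) values,
# then use a distance measure for mood-based play list matching.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Placeholder data: 50 clips x 30 features (would come from Marsyas),
# and the centroid of the human (energy, tension) ratings per clip.
features = rng.random((50, 30))
human_centroids = rng.random((50, 2))

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(features, human_centroids)

# Predict a continuous mood measure for a new clip ...
new_clip = rng.random((1, 30))
energy, tension = model.predict(new_clip)[0]

# ... and compare it with a listener's target mood using Euclidean
# distance, one of the matching strategies suggested above.
listener_mood = np.array([0.8, 0.2])  # hypothetical target mood
distance = np.linalg.norm(np.array([energy, tension]) - listener_mood)
print(f"predicted energy={energy:.2f}, tension={tension:.2f}, "
      f"distance to listener mood={distance:.2f}")
```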
